(54) Image information processing apparatus and its method



(57) Image processing apparatus and method in which
image data is inputted, the inputted image data is divided
into blocks constructed by a plurality of pixels, a motion
of the image data is detected every block, and at least the
image data of a first object and the image data of a second
object are classified from the image data in accordance with
the detection result.



FIG. 8 (block diagram of the image encoding apparatus):
image input unit 101; foreground area extraction unit 102;
boundary area extraction unit 103; background area
extraction unit 104; texture forming unit 105; shape
information forming unit 106; texture encoding unit 107;
motion compensation unit 108; shape information encoding
unit 109; multiplexing unit 110; recording unit 111.




EP 0 933 727 A2 






Description 

BACKGROUND OF THE INVENTION 

Field of the Invention

[0001] The invention relates to an image processing
apparatus and its method for performing a separating
and synthesizing process of a background and a
foreground for a motion image.

Related Background Art 

[0002] In recent years, with the spread of personal
computers equipped with advanced CPUs, demand for editing
motion images on a personal computer has been increasing.
Examples of editing work include exchanging the time order
of frames or fields, wiping, dissolving, mosaic, insertion
of another image, and the like. Instead of processing an
image on a frame or field unit basis, techniques for
separating an image into meaningful units (hereinafter
called objects) such as objects, backgrounds, characters,
or the like in the image and processing each individually
are also being improved. By changing the encoding system or
encoding parameters for every object, highly efficient
transmission or recording with enhanced error resilience
can also be performed. To perform individual processing on
an object unit basis, the object has to be extracted from a
frame or field image.
[0003] An object extracting method which has
conventionally been used with respect to a motion image is
called a "blue back". According to the blue back, a blue
background is prepared in a studio set or the like and
the blue portion is replaced with another background
image by a switcher. As methods frequently used for a still
image, a method of detecting and extracting an edge portion,
a method of extracting by providing a threshold value for a
signal level, and the like are known. However, the
conventional method using the blue back has a problem in
that if a blue picture pattern exists in an area other than
the background portion, such an area is erroneously
recognized as background. There is also a problem in that
studio equipment must be prepared. Even in a digital
process which can solve the above drawbacks, since the
arithmetic operations take a long time, there is a problem
in that real-time performance has to be sacrificed when the
process is applied to a motion image.
[0004] On the other hand, with the recent realization of
high-definition images, the contents of the information
which an image carries are changing. For example, there is a
case where characters are superimposed onto a motion image
and the resultant image is transmitted, or a case where
another image is superimposed onto the motion image and the
resultant image is transmitted. The amount of information
which can be transmitted per unit time is increasing. The
necessity to extract only necessary portions from a
plurality of pieces of information and to store or re-edit
them will further increase in the future.

[0005] In case of separating a background object and
a foreground object, however, there is hardly a case
where a boundary portion of the object is clearly
separated on a pixel unit basis. A blur area caused by
optical characteristics of an image pickup device exists,
and pixels in such a blur area are in a state where
signal levels of the background and foreground are
mixed. Such a situation is particularly typical of a moving
object. It is, therefore, important how such a vague
boundary area is handled in a process on an object unit
basis.

[0006] The problems to be solved by the invention will
now be described in detail hereinbelow with reference
to the drawings.
[0007] Fig. 1A shows an example of an original image
which is used to separate a foreground object and a
background object. A part of the image is divided into
small blocks for the explanation. Reference numeral 1001
denotes a block of the foreground object, 1002 a block of a
boundary portion, and 1003 a block of a background portion.
Figs. 1B to 1D show the blocks 1001 to 1003 in an enlarged
manner.
[0008] As will be understood from Figs. 1B to 1D, values
which are different from both the value which the foreground
object has (data in the block 1001) and the value which the
background object has (data in the block 1003) exist in the
boundary block 1002. Fig. 2 shows the luminance level of the
image of this block on a line A-A'. In this example, the
level changes smoothly from the luminance level of the
foreground to the luminance level of the background.

[0009] According to the object extraction by the blue
back, the value of the block 1003 corresponds to blue
and the data at this level is removed as a background
portion.

[0010] Fig. 3A shows a synthesized image obtained
by superimposing another background into the background
portion removed as mentioned above. Figs. 3B to 3D are
enlarged diagrams of the blocks 1001 to 1003. As will be
understood from the block 1002 in Fig. 3C, even if the
background object is replaced, the boundary area is in a
state where data of the previous object is partially
included. Therefore, discontinuous points are generated.
Fig. 4 shows such a situation by the luminance level. In
such a synthesized image, unnaturalness is conspicuous in
an edge portion. A deviation of the luminance level makes
the brightness of the edge look wrong; in case of a
deviation of a color difference level, the edge is colored
and the unnaturalness further increases.

[0011] To avoid such unnaturalness, a method
whereby only a complete foreground object portion is
extracted and synthesized onto another background
object without extracting data of a boundary area is also
considered. Fig. 5A shows an example of such a case.
Figs. 5B to 5D show the blocks 1001 to 1003 in an enlarged
manner. Since the data in the boundary area is not used, the
foreground and background are clearly separated in the
block 1002. Fig. 6 shows such a situation by the luminance
level. In an image obtained by simply superimposing the two
objects as mentioned above, the outline portion appears
visually emphasized. In this case as well, unnaturalness of
the synthesized image is conspicuous.

[0012] A method of filtering the edge is also considered
as an improvement of the above example. Figs. 7A and 7B
show examples in which a filtering process is performed on
the image of Fig. 6. According to those examples, although
the unnaturalness of the outline portion is reduced, the
width of the boundary area, which decides the degree of
blur, is unknown. Fig. 7A is the example in which the degree
of blur is too small as compared with the original image and
Fig. 7B shows the example in which the degree of blur is
excessive.
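
The effect described for Figs. 7A and 7B can be illustrated with a minimal sketch. It assumes a one-dimensional luminance profile, a simple box (moving average) filter, and hypothetical levels of 200 for the foreground and 40 for the background; none of these choices comes from the text. The point is only that the width of the blurred transition follows the chosen filter radius, not the unknown width of the original boundary area.

```python
def box_filter(levels, radius):
    """Moving-average filter; the window is clipped at both ends of the line."""
    out = []
    for i in range(len(levels)):
        lo = max(0, i - radius)
        hi = min(len(levels), i + radius + 1)
        window = levels[lo:hi]
        out.append(sum(window) / len(window))
    return out


def transition_width(levels, fg, bg):
    """Number of pixels whose level lies strictly between the two object levels."""
    low, high = min(fg, bg), max(fg, bg)
    return sum(1 for v in levels if low < v < high)


# Hard-cut composite (as in Fig. 6): foreground level 200, background level 40.
edge = [200] * 6 + [40] * 6
narrow = box_filter(edge, 1)  # blur narrower than the true boundary may be
wide = box_filter(edge, 3)    # blur wider than the true boundary may be
```

With these inputs the transition spans 2 pixels for the radius of 1 and 6 pixels for the radius of 3, whatever the blur of the original image was.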
[0013] According to the conventional method as mentioned
above, it is extremely difficult to perform a natural
image synthesization while accurately reproducing the
boundary portion.

SUMMARY OF THE INVENTION 

[0014] In consideration of the above circumstances, it 
is a concern of the invention to provide an image 
processing apparatus and its method which can extract 
an object so as to obtain an accurate and natural image 
synthesization. 

[0015] According to one preferred embodiment of the
invention, there are provided an image processing 
apparatus and its method, wherein image data is input- 
ted, the inputted image data is divided into blocks each 
constructed by a plurality of pixels, a motion of the 
image data is detected every block, and at least the 
image data of a first object and the image data of a sec- 
ond object are classified from the image data in accord- 
ance with a detection result. 

[0016] According to another preferred embodiment of
the invention, there are provided an image processing
apparatus and its method, wherein image data is inputted,
the image data is classified into at least a pixel of an
area of a first object, a pixel of an area of a second
object, and a pixel of a boundary area existing at a
boundary between the area of the first object and the
area of the second object, shape information to identify
the area of the first object, the area of the second object,
and the boundary area is formed, the classified image
data and the formed shape information are encoded,
and

wherein the shape information is information
showing at which mixture ratio the pixels of the classified
boundary area are constructed with the pixels of the
area of the first object and the pixels of the area of the
second object.
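
As an illustration of a mixture ratio, the following minimal sketch assumes that a boundary pixel is a linear mix of one foreground level and one background level. The linear model, the function names, and the luminance values are assumptions made for this example; the text states only that the shape information expresses a mixture ratio.

```python
def mixture_ratio(pixel, fg, bg):
    """Share of the foreground level in a boundary pixel (assumed linear mix)."""
    if fg == bg:
        # The two levels coincide, so the ratio cannot be determined;
        # treating the pixel as pure foreground is an arbitrary choice.
        return 1.0
    alpha = (pixel - bg) / (fg - bg)
    return min(1.0, max(0.0, alpha))  # clamp noise outside the [bg, fg] range


def resynthesize(alpha, fg, new_bg):
    """Rebuild the boundary pixel over a different background, same ratio."""
    return alpha * fg + (1.0 - alpha) * new_bg


# Hypothetical luminance levels: foreground 200, original background 40.
alpha = mixture_ratio(80, fg=200, bg=40)            # boundary pixel of level 80
new_pixel = resynthesize(alpha, fg=200, new_bg=120)  # same pixel over level 120
```

Here the boundary pixel contains a quarter of the foreground level, and the same ratio applied over the new background gives a level of 140.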

[0017] Other features and advantages of the invention 
will become apparent from the following detailed 
description taken in conjunction with the accompanying 
drawings.

BRIEF DESCRIPTION OF THE DRAWINGS 
Figs. 1A, 1B, 1C and 1D are a diagram showing an
original image and enlarged diagrams of blocks
around a boundary between a foreground area and
a background area in the original image;
Fig. 2 is a partial enlarged diagram of the block
around the boundary in Figs. 1A to 1D and a
characteristics diagram showing the relation between a
luminance level and a pixel position;
Figs. 3A, 3B, 3C and 3D are a diagram showing a
synthesized image (including a boundary area) of a
foreground of the original image and another background
and enlarged diagrams of blocks around a boundary between
a foreground area and a background area in the synthesized
image;
Fig. 4 is a partial enlarged diagram of the blocks
around the boundary in Figs. 3A to 3D and a
characteristics diagram showing the relation between a
luminance level and a pixel position;
Figs. 5A, 5B, 5C and 5D are a diagram showing a
synthesized image (not including a boundary area)
of the foreground of the original image and another
background and enlarged diagrams of blocks
around a boundary between a foreground area and
a background area in the synthesized image;
Fig. 6 is a partial enlarged diagram of the block
around the boundary in Figs. 5A to 5D and a
characteristics diagram showing the relation between a
luminance level and a pixel position;
Figs. 7A and 7B are partial enlarged diagrams of
the block around a boundary between the foreground
area and the background area when a filtering
process is performed on the synthesized image
in Figs. 5A to 5D and characteristics diagrams
showing the relation between the luminance level
and the pixel position;

Fig. 8 is a block diagram showing a construction of 
an image encoding apparatus of an embodiment 
according to the invention; 

Fig. 9 is a block diagram showing a construction of 
an image decoding apparatus of an embodiment 
according to the invention; 

Fig. 10 is a flowchart showing an algorithm of a
whole image process in the embodiment according
to the invention;

Fig. 11 is a flowchart showing the operation to classify
the foreground area, background area, and
boundary area in the embodiment according to the
invention;
Figs. 12A and 12B are diagrams for explaining the
motion of an object between frames according to
the embodiment;

Fig. 13 is a diagram for explaining an (8 x 8)-block 
division in the embodiment; 
Figs. 14A, 14B, 14C, 14D and 14E are diagrams for 
explaining a calculating method of a motion vector 
in the embodiment; 

Fig. 15 is a diagram showing a classification result 
of each block in the embodiment; 
Fig. 16 is a diagram showing an (8 x 8)-boundary 
block in the embodiment; 

Fig. 17 is a diagram for explaining a (4 x 4)-block 
forming process of a boundary block in the embod- 
iment; 

Fig. 18 is a diagram showing a motion vector calcu- 
lation result in a (4 x 4)-block in the embodiment; 
Fig. 19 is a diagram showing a (4 x 4)-boundary
block in the embodiment; 
Fig. 20 is an enlarged diagram of Fig. 19; 
Fig. 21 is a diagram for explaining a (2 x 2)-block 
forming process of the boundary block in the 
embodiment; 

Fig. 22 is a diagram for explaining a (1 x 1)-block
forming process of the boundary block in the 
embodiment; 

Fig. 23 is a diagram showing an image extracted as 
a foreground in the embodiment; 
Fig. 24 is a diagram for explaining the motion of an 
object between frames according to the embodi- 
ment; 

Fig. 25 is a diagram for explaining a block division in 
the embodiment; 

Fig. 26 is a diagram for explaining a calculating 
method of a motion vector in the embodiment; 
Fig. 27 is a diagram showing an image extracted as 
a foreground in the embodiment; 
Fig. 28 is a flowchart showing an algorithm for gen- 
erating a texture and shape information in the 
embodiment according to the invention;
Fig. 29 is a constructional diagram showing an 
example of boundary pixels; 
Fig. 30 is a characteristics diagram showing the 
relation between a luminance level and a pixel posi- 
tion; 

Fig. 31 is a diagram for explaining the formation of 
the shape information; 

Fig. 32 is a flowchart showing an algorithm for a 
synthesizing process in the embodiment; 
Figs. 33A, 33B, 33C and 33D are a diagram showing
a synthesized image of a foreground of an original
image and another background in the
embodiment according to the invention and 
enlarged diagrams of blocks around a boundary 
between a foreground area and a background area 
in the synthesized image; 

Fig. 34 is a partial enlarged diagram of the blocks
around the boundary in Figs. 33A to 33D and a
characteristics diagram showing the relation
between a luminance level and a pixel position;
Fig. 35 is a flowchart showing an algorithm for a
boundary process in the embodiment;
Fig. 36 is a flowchart showing an algorithm for a 
synthesizing process in the embodiment; 
Fig. 37 is a constructional diagram showing an 
example of the boundary process in the embodi- 
ment; and 

Fig. 38 is a characteristics diagram showing the 
relation between a luminance level and a pixel posi- 
tion in the embodiment. 

DETAILED DESCRIPTION OF THE PREFERRED 
EMBODIMENTS 



[0019] An embodiment of the invention will now be
described hereinbelow with reference to the drawings.
[0020] Fig. 8 is a diagram showing a whole construction
of an image encoding apparatus of the embodiment
according to the invention.

[0021] First, a motion image of a predetermined format
is fetched by an image input unit 101. When the
input is an analog signal, the analog signal is A/D
converted to digital data. In case of a color image, the
color image is divided into a luminance signal and two color
difference signals and a similar process is executed on
each of those signals.
[0022] A texture forming unit 105 and a shape information
forming unit 106 are necessary to encode an object, and
their data is formed on a pixel unit basis. The data
obtained by a foreground area extraction unit 102 is stored
as it is as texture data. As for the shape information, a
value indicative of foreground data is inputted. As for the
data obtained by a boundary area extraction unit 103, the
data of a foreground area is used to form a texture. A value
calculated and outputted from the foreground area extraction
unit 102 and a background area extraction unit 104 is used
as shape information. In case of encoding a foreground
object, the data derived by the background area extraction
unit 104 is not directly used as texture data. The details
of those processing algorithms will be described later.
[0023] The texture data and the shape information
data are processed by a texture encoding unit 107 and
a shape information encoding unit 109, respectively. A
motion compensation unit 108 is necessary when
differential data between the frames or fields is used. The
encoded data is collected at a system layer and multiplexed
by a multiplexing unit 110. In case of collectively
transmitting a plurality of objects, the processes so far
are executed time-divisionally and the resultant objects
are multiplexed into one bit stream by the multiplexing unit
110. The one bit stream is recorded onto a medium
such as an optical disk, video tape, or the like by a
recording unit 111.

[0024] Fig. 9 is a whole constructional diagram of an
image decoding apparatus of the embodiment according
to the invention and shows processes which are
fundamentally opposite to those in Fig. 8. The encoded
data reproduced by a reproduction unit 200 is supplied
to a separation unit 201, by which the multiplexed data
is separated from the encoded data. A decoding process
is time-divisionally performed by a texture decoding
unit 202. The decoding process is successively performed
by using the data decoded by a shape information
decoding unit 204 and, if a motion compensation has
been performed by a motion compensation unit 203, by
using the motion compensated data. In a synthesization
processing unit 205, a plurality of objects decoded on
the basis of the description at the system layer are
synchronously reconstructed. An image output unit 206
forms output data in accordance with a desired format.
[0025] A flow of the data in the respective units 102 to
106 in Fig. 8 will now be described in detail with
reference to a flowchart of Fig. 10. Fig. 10 shows a whole
algorithm of the above portion. First, in step S301, an
initial setting is performed. The number of frames as
subjects to be processed, the number of the frame which is
first used as a subject, a search range when a motion
vector is obtained, and the like are specified.
[0026] In step S302, the frame as a subject is divided 
into blocks. In case of a color image, each frame is 
divided into blocks. Although a process by only the lumi- 
nance signal is possible, a result of higher precision can 
be derived by adding a process of the color difference 
signals. 
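
The block division of step S302 can be sketched as follows. The function name, the tuple layout, and the clipping of blocks at the right and bottom edges are assumptions made for this example; the text does not say how a frame whose size is not a multiple of the block size is handled.

```python
def divide_into_blocks(height, width, n):
    """List (top, left, block_height, block_width) for every n x n block.

    Blocks on the right and bottom edges are clipped to the frame, which is
    an assumption of this sketch.
    """
    blocks = []
    for top in range(0, height, n):
        for left in range(0, width, n):
            blocks.append((top, left, min(n, height - top), min(n, width - left)))
    return blocks


# A hypothetical 16 x 24 luminance frame divided into (8 x 8) blocks.
blocks = divide_into_blocks(16, 24, 8)
```

For a color image the same division would be applied to the luminance signal and to each color difference signal.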

[0027] In step S303, a motion vector is detected
between the sampling frames. This detection is executed
with respect to all of the blocks, and the sampling
frame is changed as necessary and the motion vector
detection is further performed.

[0028] The motion vectors are classified in step S304
on the basis of the large quantity of motion vector data
obtained as mentioned above. As a discriminating
method, it is sufficient to assign the motion vector having
the largest motion vector value to a background object
portion and the motion vector having the second
largest motion vector value to a foreground object portion.
A boundary block exists at the position sandwiched
by the foreground block and the background block. In
the classification of the motion vectors, there are a case
where they can be classified from one sampling frame
and a case where they are classified from a plurality of
sampling frames.

[0029] If the motion vectors can be classified on a
block unit basis, they are further divided on a pixel unit
basis. All of the pixels in each of the foreground blocks
and the background blocks can be regarded as pixels in the
same classification. Only the boundary block is selected in
step S305 and is classified on a finer unit basis in step
S306. By converging the foreground portion and the
background portion in the block, a boundary area can be
decided. By determining the foreground and the background
from a plurality of sampling frames, a boundary area
can be decided at high precision.
[0030] A check is made in step S307 to see if the
processes have been finished for all of the blocks. A check
is further made in step S308 to see if the processes have
been finished for all of the frames.
[0031] At a time point when the processes of all of the
subject frames are finished, the boundary area is
decided in step S309 and a state where the objects can
be separated is obtained.

[0032] Subsequently, a texture and shape information 
are formed for all of the frames and all of the pixels in 
step S310.

[0033] The classifying processes of the foreground 
area, background area, and boundary area will now be 
described further in detail with reference to a flowchart 
of Fig. 11. 

[0034] As shown in Figs. 12A and 12B, a case where 
a foreground object 301 has been moved to another 
location within a time of one frame (from a previous 
frame of Fig. 12A to a current frame of Fig. 12B) will be 
described as an example. It is assumed that a back- 
ground 302 is not moved. 

[0035] In Fig. 11, first, an image of the current frame
shown in Fig. 12B is divided into a plurality of blocks.
That is, as first step S101, an initial value of a size of an
(N x N)-block is determined. Explanation will now be
made on the assumption that the initial value is equal to
N = 8, namely, one block has a size of (8 x 8). After the
block size is initialized, the image is divided into
blocks in step S102. Fig. 13 is a diagram showing the
block formed image at the initial stage.
[0036] Subsequently, in step S103, a motion vector is
calculated with regard to each of the divided blocks. As
a method of calculating the motion vectors, there is a
general method called pattern matching, in which the
same image as the image of the subject block is
searched for in the previous frame. The pattern matching
method will now be described with reference to Figs.
14A to 14E. As shown in Fig. 14A, explanation will now
be made with respect to typical blocks 501, 502, and
503 among the blocks of the current frame as an example.
[0037] In case of an image such as a background
which is not moving like a block 501, if the same image
as the image of this block is searched for, a block 504 in
Fig. 14B corresponds to such an image. As shown in
Fig. 14D, if the images of Figs. 14A and 14B are
overlapped and considered, since the positions of the blocks
501 and 504 on the screen are the same, the motion vector
is equal to 0 (zero).

[0038] In case of an image of the foreground object
301 which was moved in parallel between the frames
like a block 503, if the same image as the image of the
block 503 is searched for in the previous frame, a block
505 in Fig. 14C corresponds to such an image. As
shown in Fig. 14D, if the images of Figs. 14A and 14C
are overlapped and considered, a motion vector 508 is
obtained from the positional relation of the blocks 503
and 505. When the foreground object 301 includes a
plurality of blocks like blocks 503, 506, and 507 in Fig.
14E, their motion vectors 508, 509, and 510 are the
same.

[0039] Further, when the block includes both of the 
foreground image and the background image like a 
block 502, the image of the same pattern cannot be 
found out in the previous frame. In the ordinary pattern 
matching, since the motion vector is calculated from the 
block in which the least square error is the smallest 
between the pixels in the searched range, in the block 
502, a vector value which is different from both of the 
motion vector in the object and the background motion 
vector is calculated. 
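
The pattern matching of step S103 can be sketched as an exhaustive block search that minimizes the sum of squared pixel differences. This is a minimal sketch under assumptions: frames are plain lists of rows of luminance values, the vector is expressed as the (row, column) offset from the block in the current frame to its best match in the previous frame, and ties are resolved in favor of the shorter vector. The function names and the tie rule are not taken from the text.

```python
def block_error(cur, prev, top, left, dy, dx, n):
    """Sum of squared differences between a current block and a shifted block."""
    total = 0
    for y in range(n):
        for x in range(n):
            diff = cur[top + y][left + x] - prev[top + dy + y][left + dx + x]
            total += diff * diff
    return total


def motion_vector(cur, prev, top, left, n, search):
    """Offset (dy, dx) into the previous frame with the smallest error."""
    height, width = len(prev), len(prev[0])
    best, best_err = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the previous frame.
            if top + dy < 0 or left + dx < 0:
                continue
            if top + dy + n > height or left + dx + n > width:
                continue
            err = block_error(cur, prev, top, left, dy, dx, n)
            shorter = best is not None and (
                abs(dy) + abs(dx) < abs(best[0]) + abs(best[1]))
            if best_err is None or err < best_err or (err == best_err and shorter):
                best, best_err = (dy, dx), err
    return best, best_err


# Hypothetical 8 x 8 frames: a 2 x 2 object moves two pixels to the right.
prev = [[0] * 8 for _ in range(8)]
cur = [[0] * 8 for _ in range(8)]
for y in (2, 3):
    for x in (2, 3):
        prev[y][x] = 9
    for x in (4, 5):
        cur[y][x] = 9

object_vector = motion_vector(cur, prev, 2, 4, 2, 3)      # block on the object
background_vector = motion_vector(cur, prev, 6, 6, 2, 3)  # still background
```

A block that straddles the object and the background, like the block 502, has no exact match, so the search returns whichever offset happens to give the smallest error.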

[0040] At the stage where the motion vectors of all of 
the blocks included in one current frame image are 
obtained, each block is classified into the background 
block, foreground block, or boundary block of the fore- 
ground and the background. That is, first in step S104, 
a check is made to see if the motion vector value of the 
background block has already been decided. If it is not 
yet determined, step S105 follows and a motion vector 
value Vb of the background block has to be decided. 
Step S105 is a processing routine to decide the motion 
vector value Vb of the background block. However, if it 
has previously been known that the background image 
is not moved, it is sufficient to set 0 (zero) to the motion 
vector value Vb of the background block. 
[0041] If the motion vector value Vb of the background
block is decided, in next step S106, a comparison 
between the motion vector value of each block and the 
decided motion vector value Vb of the background block 
is performed for each block in the current frame, thereby 
discriminating whether the relevant block is the back- 
ground block or not. With respect to the block in which 
the values almost coincide as a result of the comparison 
between the vector values, it is processed as a back- 
ground block in step S107. 

[0042] When the user wants to extract a foreground 
image (for example, in case of the foreground area 
extraction unit 102 in Fig. 8), the background block can
be rejected. However, if the user wants to extract the 
background image (for instance, in case of the back- 
ground area extraction unit 104 in Fig. 8), it is necessary 
to accumulate the data into a memory (not shown) in the 
background area extraction unit 104 until all of the back- 
ground blocks are collected. 

[0043] In next step S108, a check is made to see if the
motion vector value to decide the foreground block has
already been determined. If such a motion vector value
is not decided, it has to be obtained. As mentioned
above, since all of the blocks in the foreground object
301 have the same motion vector value, it is sufficient to
set this value to the motion vector value Va of the
foreground block in step S109.

[0044] When the motion vector value Va of the
foreground block is determined, in step S110, a comparison
between the motion vector value of each block and the
decided motion vector value Va of the foreground block
is performed for each block in the current frame, thereby
discriminating whether the relevant block is the
foreground block or not. Actually, even in case of the
boundary block, if the motion vector value is almost the
same as the motion vector value of the foreground block, it
is decided as a foreground block. The block in which the
value almost coincides as a result of comparison
between the vector values is processed as a foreground
block in step S111.

[0045] If it is determined in steps S106 and S110 that
the relevant block is neither the background block nor
the foreground block, such a block denotes the boundary
block. As mentioned above, a plurality of blocks of
the (8 x 8) size included in one current frame image can
be classified into three kinds, namely, background block,
foreground block, and boundary block, from the calculated
motion vector value. A classified result is shown in Fig.
15. Fig. 16 is a diagram showing only the extracted
boundary block.
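
The three-way classification of steps S106 to S111 can be sketched as a comparison of each block's motion vector with Vb and then with Va. The tolerance parameter `tol` is a hypothetical stand-in for "almost coincide"; the text gives no numerical criterion, so its form and default value are assumptions of this sketch.

```python
def classify_block(v, vb, va, tol=0):
    """Label a block from its motion vector v, given Vb and Va.

    tol is a hypothetical per-component tolerance for "almost coincide".
    """
    def close(a, b):
        return abs(a[0] - b[0]) <= tol and abs(a[1] - b[1]) <= tol

    if close(v, vb):   # step S106: compare with the background vector first
        return "background"
    if close(v, va):   # step S110: then compare with the foreground vector
        return "foreground"
    return "boundary"  # neither comparison matched


# Hypothetical vectors: still background, foreground moved by (3, -1).
vb, va = (0, 0), (3, -1)
labels = [classify_block(v, vb, va) for v in [(0, 0), (3, -1), (1, 0)]]
```

With these inputs the three blocks are labeled background, foreground, and boundary in that order.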

[0046] After completion of the classifying process as
mentioned above, the block size of the boundary block
is further divided in half in the vertical and lateral
directions in step S112, namely, N = 4. The processing
routine is returned to step S102 and the divided block is
again divided into blocks. Each of a plurality of boundary
blocks shown in Fig. 16 is further divided into
blocks of the size of (4 x 4) and a result is shown in Fig.
17. In step S103, a motion vector is calculated with
respect to each of the boundary blocks which were
again divided into the blocks of the size of (4 x 4). A
result of calculation of the motion vector of the size of
(4 x 4) is shown in Fig. 18.

[0047] In a manner similar to the above, processes in
steps S104 to S112 are executed for the motion vector
of the (4 x 4) block size as a subject. However, since the
motion vector values Vb and Va to decide the background
block and the foreground block have already
been obtained, the processes in steps S105 and S109
are unnecessary. Fig. 19 shows the portion which is
decided as a boundary block whose block size is equal
to (4 x 4). Fig. 20 is an enlarged diagram of Fig. 19. The
boundary block is further divided into blocks of the
size of (2 x 2). The processes in the above steps are
repeated.

[0048] After that, when all of the blocks are classified
into the background block and the foreground block and
there is no boundary block, the separating process of
the foreground and background is finished. Since the
minimum unit of the block size is equal to (1 x 1), this
block is certainly classified into either the background
block or the foreground block. Fig. 21 is a diagram
showing a state when the boundary block is divided into
blocks of (2 x 2). Fig. 22 is a diagram showing a state
when the boundary block is further divided into the
blocks of (1 x 1). When the foreground portion is
extracted as a result in which all of the blocks were
classified into the foreground and the background as
mentioned above, the foreground image separated from the
background is obtained as shown in Fig. 23.
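
The repeated halving of boundary blocks from (8 x 8) down to (1 x 1) can be sketched as a recursion. In this sketch the per-block decision is passed in as a callback; the demonstration uses a hypothetical classifier driven by a known mask in place of the motion vector comparison, purely so that the example is self-contained. The block size is assumed to be a power of two.

```python
def refine(top, left, n, classify, labels):
    """Label every pixel of an n x n block, halving boundary blocks."""
    label = classify(top, left, n)
    if label != "boundary":
        for y in range(top, top + n):
            for x in range(left, left + n):
                labels[(y, x)] = label
        return
    if n == 1:
        # The text states that a (1 x 1) block is always foreground or
        # background; a classifier that says otherwise is an error here.
        raise ValueError("a (1 x 1) block must be foreground or background")
    half = n // 2
    for dy in (0, half):
        for dx in (0, half):
            refine(top + dy, left + dx, half, classify, labels)


def make_mask_classifier(mask):
    """Hypothetical stand-in for the motion vector test of steps S106/S110."""
    def classify(top, left, n):
        values = {mask[y][x]
                  for y in range(top, top + n)
                  for x in range(left, left + n)}
        if values == {1}:
            return "foreground"
        if values == {0}:
            return "background"
        return "boundary"
    return classify


# Hypothetical 8 x 8 block whose columns 3 to 7 belong to the foreground.
mask = [[1 if x >= 3 else 0 for x in range(8)] for _ in range(8)]
labels = {}
refine(0, 0, 8, make_mask_classifier(mask), labels)
```

Only the blocks that straddle the edge are subdivided, so the number of classifications stays small when the boundary is short compared with the frame.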
[0049] As mentioned above, according to the separating
process of the foreground and the background of the
embodiment, the current frame of the motion image is
divided into a plurality of blocks, the motion vector of
each block is calculated between the images which are
different with respect to the time, and the separation into
the background image and the foreground image is
performed by using the calculation result. Therefore, the
inconvenience such that a blue image portion other
than the background is erroneously recognized as a
background as in the conventional blue back system
can be prevented. The separation into the background
and the foreground can be performed at a high speed by
a simple algorithm such as the use of the calculated
motion vector. The real-time performance of the motion
image can be assured. A construction of the apparatus
is also simplified.

[0050] Although the above embodiment has been
described with respect to the case where the image of
the background portion does not move, an example of a
separating process in the case where the background
portion also moves will now be described hereinbelow.
In this embodiment, although a fundamental algorithm
is similar to that in Fig. 11, a processing routine to
decide the motion vector value Vb of the background
block in step S105 differs from that in the foregoing
embodiment. Images on the screen will be explained
with reference to Figs. 24 to 27.
[0051] The embodiment will be described with respect
to an example in which the foreground object moves
from the lower position toward the upper position on the
screen and the background moves from the right side
toward the lower left side on the screen as shown in Fig.
24.

[0052] Fig. 25 is an explanatory diagram of the block
formation in step S102 in Fig. 11. The explanation will
now be limited to the foreground object and the
blocks around it. By calculating motion vectors of those
blocks in step S103 in Fig. 11, motion vectors as shown
in Fig. 26 are obtained.

[0053] Since there is no boundary block in the example
of Fig. 26, the motion vectors can be classified into
two kinds. One of the kinds relates to the motion vector
Vb of the background block and the other relates to the
motion vector Va of the foreground block. In the
embodiment, the motion vectors having the larger
occurrence frequency are decided as the motion vectors Vb
of the background block in step S105 in Fig. 11. In this
case, the motion vectors 601 in Fig. 26 are the motion
vectors Vb of the background block.
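
The occurrence-frequency rule of step S105 can be sketched as follows. This is a minimal illustration and not the routine of Fig. 11 itself: it assumes that each block already has a motion vector expressed as a `(dy, dx)` tuple and that "almost uniform" vectors have been quantized to identical values. The function name `split_by_dominant_vector` and the sample grid are hypothetical.

```python
from collections import Counter


def split_by_dominant_vector(block_vectors):
    """block_vectors maps a block position to its motion vector.

    The most frequent vector is taken as Vb (background); every block
    with a different vector is treated as foreground.
    """
    counts = Counter(block_vectors.values())
    vb, _ = counts.most_common(1)[0]
    background = [pos for pos, v in block_vectors.items() if v == vb]
    foreground = [pos for pos, v in block_vectors.items() if v != vb]
    return vb, background, foreground


# Hypothetical 3 x 3 grid: the background drifts toward the lower left
# (dy = +1, dx = -1) and the centre block moves upward (dy = -1, dx = 0).
grid = {(r, c): (1, -1) for r in range(3) for c in range(3)}
grid[(1, 1)] = (-1, 0)

vb, background, foreground = split_by_dominant_vector(grid)
```

Here `vb` is the lower-left vector shared by the eight surrounding blocks, and the centre block is the only foreground block.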

[0054] The same result is obtained in this example if,
instead, the almost uniform motion vectors that surround
the portion in which the motion vectors 602 of the same
value are collected are set as the motion vectors Vb of
the background block.

[0055] Since the boundary block is not included in this
example, the processing routine is finished by one loop
in steps S101 to S111 in Fig. 11. However, when a
boundary block exists, the boundary block is divided
into further small blocks in step S112 and the processes
from step S102 are repeated. By the time the block has
been divided into the minimum unit (namely, a block of
the size of (1 x 1)), the motion vector Va of the
foreground block or the motion vector Vb of the
background block is reliably obtained.
Therefore, the processing routine is finished at a stage
where all of the blocks have either one of the motion
vector values. Fig. 27 is a diagram showing a
foreground image extracted by this algorithm.
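
The repeated subdivision of boundary blocks can be sketched as a recursion. The sketch below is only an illustration under stated assumptions: motion detection is replaced by a stand-in, `uniform_vector`, which reports a block's vector only when every pixel in the block shares it. Block sizes are assumed to be powers of two so that each boundary block splits into four equal sub-blocks. The names and the 4 x 4 layout are hypothetical.

```python
def uniform_vector(pixel_vectors, top, left, size):
    """Stand-in for motion detection: the block's vector if all of its
    pixels share one vector, otherwise None (treated as a boundary block)."""
    seen = {pixel_vectors[r][c]
            for r in range(top, top + size)
            for c in range(left, left + size)}
    return seen.pop() if len(seen) == 1 else None


def classify_block(pixel_vectors, top, left, size, va, vb, labels):
    v = uniform_vector(pixel_vectors, top, left, size)
    if v == va:
        labels[(top, left, size)] = "foreground"
    elif v == vb:
        labels[(top, left, size)] = "background"
    elif size == 1:
        # The description states that a (1 x 1) block always yields Va or Vb.
        raise ValueError("1 x 1 block matches neither Va nor Vb")
    else:
        half = size // 2  # boundary block: divide and classify again
        for dy in (0, half):
            for dx in (0, half):
                classify_block(pixel_vectors, top + dy, left + dx,
                               half, va, vb, labels)


def classify_frame(pixel_vectors, size, va, vb):
    labels = {}
    for top in range(0, len(pixel_vectors), size):
        for left in range(0, len(pixel_vectors[0]), size):
            classify_block(pixel_vectors, top, left, size, va, vb, labels)
    return labels


VA, VB = (-1, 0), (1, -1)
layout = ["FFFB", "FFFB", "BBBB", "BBBB"]  # F: foreground pixel, B: background
pixel_vectors = [[VA if ch == "F" else VB for ch in row] for row in layout]
labels = classify_frame(pixel_vectors, 2, VA, VB)
```

With this layout, three of the four (2 x 2) blocks are classified at once, and only the mixed upper-right block is divided into (1 x 1) blocks.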
[0056] According to the embodiment as mentioned
above, the separating process of the background and the
foreground can be reliably performed at a high speed not
only in the case where the background image is a still
image but also in the case where the background image
itself moves.

[0057] In the above embodiment, the background 
image and the foreground image have been separated 
from the motion vector calculated every block. However, 
the invention is not limited to this method. For instance, 
the invention can be also similarly applied to the case of 
separating a portion which is moving on the screen and 
a portion which is not moving. Image portions having 
different motion vector values can be also separated, 
respectively. 

[0058] In the above embodiment, whether the relevant
block is the background block or the foreground block
has been discriminated for every divided block, and in
the case where the relevant block is neither the
background block nor the foreground block, it is
determined that the block is the boundary block. However,
the discrimination of the boundary block can also be
realized by checking whether or not the relevant block
neighbors the background block. Even if the relevant
block is adjacent to the background block, so long as the
motion vector of this block is the same as that of the
internal block of the foreground, it is determined that
the relevant block is the foreground block.

[0059] An algorithm to form the texture and shape 
information in the embodiment according to the inven- 
tion will now be described in detail with reference to Fig. 
28. 

[0060] First, in step S401, a check is made to see if the
pixel which is a subject at present is a pixel in the
foreground area. If YES, its value is stored as it is as
texture data in step S402. The shape information is
further determined in step S403. It is now assumed that
the shape information is expressed by 8 bits, that
(α = 255) denotes a foreground portion of 100%, and that
(α = 0) denotes a background portion of 100%.
[0061] In step S404, a check is made to see if the pixel
which is at present a subject is a pixel in the
background area. If YES, a padded value is used as texture
data in step S405. When the foreground object is
encoded, since the background image data is unnecessary,
desired data can be filled into the background portion.
To raise the encoding efficiency, an operation to
repetitively fill the data of an edge portion of the
foreground or to fill a predetermined value is executed.
The shape information is set to (α = 0) in step S406.
[0062] A check is subsequently made in step S407 to
see if the pixel which is at present a subject is a pixel
in the boundary area. If YES, the foreground area pixel at
the position that is the nearest to such a pixel is
obtained in step S408. The value of the foreground area
pixel is set to the texture data in step S409. In step
S410, the background area pixel at the position that is
the closest to such a pixel is also obtained. The shape
information is calculated in step S411 on the basis of
those pixel values. Now, assuming that the value of the
foreground pixel at the nearest position from the
boundary pixel is denoted by A, the value of the
background pixel is denoted by B, and the value of the
boundary pixel is denoted by M, the shape information α of
the boundary area is obtained by the following equation.

α = 255 · (M - B)/(A - B) (1)

[0063] A specific example in this case will now be
described with reference to Figs. 29 and 30. Fig. 29
shows an example of pixels near the boundary area.
Reference numeral 701 denotes foreground pixels, 702
and 703 denote boundary pixels, and 704 denotes
background pixels. The foreground pixels at the position
that is the nearest to the boundary pixels 702 are the
pixels 701. The background pixels at the position that is
the closest to the boundary pixels 702 are the pixels
704. The same applies to the boundary pixels 703. Fig. 30
shows luminance levels of the pixels 701 to 704.
[0064] Now, assuming that a value of the foreground
pixel 701 is equal to 250, a value of the boundary pixel
702 is equal to 220, a value of the boundary pixel 703 is
equal to 120, and a value of the background pixel 704 is
equal to 100, the shape information in the boundary
pixel 702 is obtained as follows.

α = 255 · (220 - 100)/(250 - 100) = 204 (2)

[0065] The shape information in the boundary pixel
703 is obtained as follows.

α = 255 · (120 - 100)/(250 - 100) = 34 (3)
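
The calculation of equations (1) to (3) can be checked with a few lines of Python. The function name `shape_alpha` is hypothetical. The rounding to an integer, the clamping to the 8-bit range, and the error raised when A equals B are assumptions added for the sketch, since the text does not address those cases.

```python
def shape_alpha(m, a, b):
    """Shape information per equation (1): 255 * (M - B) / (A - B).

    m: boundary pixel value, a: nearest foreground pixel value,
    b: nearest background pixel value.
    """
    if a == b:
        # Not covered by the description; treated as an error in this sketch.
        raise ValueError("foreground and background values are equal")
    alpha = round(255 * (m - b) / (a - b))
    return max(0, min(255, alpha))  # keep the result within 8 bits


alpha_702 = shape_alpha(220, 250, 100)  # boundary pixel 702
alpha_703 = shape_alpha(120, 250, 100)  # boundary pixel 703
```

The two calls reproduce the values 204 and 34 given for the boundary pixels 702 and 703.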

[0066] The processes mentioned above are repeated, and
a check is made in step S412 to see if the processes for
all of the pixels have been finished. Further, a
check is made in step S413 to see if the processes for
all of the frames have been finished. When they have, the
processing routine is finished. Fig. 31 is a diagram for
explaining the formation of the shape information in the
example of Fig. 2 mentioned above. Since the shape
information is 8-bit data, the position of 0% in Fig. 31
is set to (α = 0) and the position of 100% is set to
(α = 255).
[0067] An algorithm for the synthesizing process will
now be described with reference to Fig. 32. As will be
obviously understood from the explanation of Fig. 28, in
the present system, since all of the pixels have the pixel
value and the shape information as a set, the algorithm
for the synthesizing process is simple.
[0068] First, in step S701, the shape information is
discriminated, and the pixel value of the display is
determined in step S702.

[0069] Now, assuming that the value of the foreground
pixel is set to A, the value of the background pixel is
set to B, and the pixel value to be obtained is set to M,
M is expressed as follows.

M = A · (α/255) + B · (1 - α/255) (4)

The above processes are repeated for all of the pixels.
When it is decided in step S703 that those processes
have been finished, the synthesizing processing routine
is finished.
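
Equation (4) is an ordinary weighted blend, as the following minimal Python sketch shows. The function name `synthesize_pixel` is hypothetical. The new background value of 40 in the last call is an invented sample, used only to show that the same shape information can be reused with a different background.

```python
def synthesize_pixel(a, b, alpha):
    """Equation (4): M = A * (alpha / 255) + B * (1 - alpha / 255)."""
    w = alpha / 255
    return a * w + b * (1 - w)


# With the original background (B = 100), the shape information of
# equations (2) and (3) gives back the original boundary pixel values.
m_702 = synthesize_pixel(250, 100, 204)
m_703 = synthesize_pixel(250, 100, 34)

# The same shape information applied over a different, darker background.
m_new = synthesize_pixel(250, 40, 204)
```

Up to floating-point rounding, `m_702` and `m_703` are 220 and 120, the values of the boundary pixels 702 and 703 from which the shape information was derived.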

[0070] Fig. 33A is a diagram showing a synthesized 
image of another foreground image and a background 
image in the invention. 

[0071] Figs. 33B to 33D are enlarged diagrams of
blocks 2001 to 2003. Fig. 34 is an enlarged diagram of
a luminance level on an A-A' line in Fig. 33C. As will be
obviously understood from those diagrams, even in the
synthesizing process in which the background object is
changed, there is no unnaturalness in the outline
portion and a degree of blur is smooth in a manner similar
to the original image.

[0072] According to the embodiment, when the sub- 
ject object is extracted from the motion image and is 
synthesized to another image, by obtaining the motion 
vector between the frames of the information in the 
boundary area, the object is separated into a perfect 
subject object area, a perfect background area, and a 
boundary area in which both of those areas mixedly 
exist. By also adding the shape information to the 
extracted image data, each area can be discriminated. 
At the time of an image synthesis, the pixel value of the 
boundary area is again calculated from the shape infor- 
mation. 

[0073] With the above construction, the extraction of 
the object which can be easily re-processed and has a 
high generality can be easily and certainly performed. 
[0074] Another embodiment of the forming process of 
the shape information will now be described. 
[0075] The embodiment uses an algorithm obtained
by simplifying the algorithm to form the shape
information, and processes other than the processes in
steps S407 to S411 in Fig. 28 are substantially the same
as those mentioned above. The processes which replace
steps S407 to S411 will now be described with reference
to Fig. 35. When the boundary pixel is determined
in step S2301, an arbitrary value is set as texture data
in step S2302. Padding can also be executed in a manner
similar to the process for the background area in
consideration of the encoding efficiency. In step S2303,
a flag is set into the shape information. Any value can
be used so long as the boundary area can be
discriminated. That is, only the position information of
the boundary area is stored here.

[0076] Fig. 36 shows an algorithm for a synthesizing
process in another embodiment. Fig. 36 differs from Fig.
32 in that, since there is no data in the
boundary area, a forming process of this portion is
added. When the pixel of the boundary area is decided
in step S2401, a foreground area pixel at the position
that is the nearest to such a pixel is obtained in step
S2402. A background area pixel at the position that is
the nearest to the pixel of the boundary area is obtained
in step S2403. A value of the boundary pixel to be
displayed is obtained in step S2404 by using the pixel
values of the foreground pixel and the background pixel
and the distances to those two pixels.
[0077] It is now assumed that the value of the
foreground pixel at the position which is the nearest to
the boundary pixel is set to A, the value of the
background pixel is set to B, the distance to the
foreground pixel is set to a, and the distance to the
background pixel is set to b. The value M of the boundary
pixel is obtained by the following equation.

M = (A*b + B*a)/(a + b) (5)

[0078] Specific examples will now be described
hereinbelow with reference to Figs. 37 and 38. Reference
numeral 2501 in Fig. 37 denotes foreground pixels,
2502 and 2503 denote pixels of the boundary area, and
2504 denotes background pixels. Fig. 38 shows luminance
levels of the pixels 2501 to 2504. Pixel values of the
pixels 2502 and 2503 of the boundary area are calculated
from those two data.
[0079] First, the foreground pixel at the position that is
the nearest to the pixel 2502 of the boundary area is the
pixel 2501; its value is A = 250, and its distance is
a = 1. The background pixel at the position that is the
nearest to the pixel 2502 is the pixel 2504; its value is
B = 100, and its distance is b = 2. Therefore, the pixel
value of the pixel 2502 is as follows.

M = (250*2 + 100*1)/(1 + 2) = 200 (6)

[0080] Similarly, the foreground pixel at the position
that is the nearest to the pixel 2503 of the boundary
area is the pixel 2501; its value is A = 250, and its
distance is a = 2. The background pixel at the position
that is the nearest to the pixel 2503 is the pixel 2504;
its value is B = 100, and its distance is b = 1.
Therefore, the pixel value of the pixel 2503 is as
follows.

M = (250*1 + 100*2)/(2 + 1) = 150 (7)
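
The distance-weighted interpolation of equation (5) and the two worked examples can be reproduced with the following Python sketch. The names `boundary_pixel_value` and `fill_boundary_row` are hypothetical. The restriction to a single row of pixels, with labels "F", "M", and "B" for foreground, boundary, and background, is a simplifying assumption, since the description does not specify how the nearest pixels are searched.

```python
def boundary_pixel_value(a_value, b_value, dist_fg, dist_bg):
    """Equation (5): M = (A*b + B*a) / (a + b)."""
    return (a_value * dist_bg + b_value * dist_fg) / (dist_fg + dist_bg)


def fill_boundary_row(labels, values):
    """Fill every boundary pixel ("M") of one row from its nearest
    foreground ("F") and background ("B") pixels in the same row."""
    out = list(values)
    for i, label in enumerate(labels):
        if label != "M":
            continue
        fg = min((j for j, l in enumerate(labels) if l == "F"),
                 key=lambda j: abs(j - i))
        bg = min((j for j, l in enumerate(labels) if l == "B"),
                 key=lambda j: abs(j - i))
        out[i] = boundary_pixel_value(values[fg], values[bg],
                                      abs(fg - i), abs(bg - i))
    return out


# One row modelled on the example: foreground 250, two boundary pixels
# that carry no data, background 100.
row = fill_boundary_row(["F", "M", "M", "B"], [250, None, None, 100])
```

The two boundary positions are filled with 200 and 150, matching equations (6) and (7).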


[0081] The image processing apparatus of the invention
can be applied to a system constructed by a plurality
of apparatuses (for example, a host computer,
interface equipment, a reader, a VTR, a TV, a printer,
etc.) or can also be applied to a single apparatus
(for instance, a digital TV camera, a personal
computer, a copying apparatus, or a facsimile
apparatus).

[0082] A construction such that in order to make vari- 
ous devices operative so as to realize the functions of 
the foregoing embodiment, program codes of software 
to realize the functions of the embodiments are supplied 
to a computer in an apparatus or system which is con- 
nected to the various devices and the various devices 
are made operative in accordance with the programs 
stored in the computer (CPU or MPU) of the system or 
apparatus, thereby embodying the invention is also 
incorporated in the purview of the invention. 
[0083] In this case, the program codes themselves of 
the software realize the functions of the foregoing 
embodiments. The program codes themselves and 
means for supplying the program codes to the compu- 
ter, for example, a memory medium in which the pro- 
gram codes have been stored construct the invention. 
As a memory medium to store the program codes, it is
possible to use any one of, for example, a floppy disk, a
hard disk, an optical disk, a magnetooptical disk, a CD-
ROM, a magnetic tape, a non-volatile memory card, a
ROM, and the like.

[0084] It will be obviously understood that not only in 
the case where the functions of the foregoing embodi- 
ments are realized by executing the supplied program 
codes by the computer but also in the case where the 
functions of the foregoing embodiments are realized by 
the program codes in cooperation with the OS (Operat- 
ing System) which is operating in the computer, another 
application software, or the like, the program codes are 
included in the embodiments of the invention. 
[0085] Further, it will be also obviously understood 
that a case where after the supplied program codes 
were stored into a memory provided for a function 
expanding board of a computer or a function expanding 
unit connected to the computer, a CPU or the like pro- 
vided for the function expanding board or function 
expanding unit executes a part or all of the actual
processes on the basis of instructions of the program codes,
and the functions of the foregoing embodiments are 
realized by the above processes is also incorporated in 
the invention. 

[0086] In other words, the foregoing description of
embodiments has been given for illustrative purposes
only and is not to be construed as imposing any limitation
in every respect.

[0087] The scope of the invention is, therefore, to be
determined solely by the following claims and is not
limited by the text of the specification, and alterations
made within a scope equivalent to the scope of the
claims fall within the true spirit and scope of the
invention.

Claims

1. An image processing apparatus comprising:

a) input means for inputting image data; 

b) dividing means for dividing the image data 
inputted by said input means into blocks con- 
structed by a plurality of pixels; 

c) detecting means for detecting a motion of 
said image data every said block; and 

d) classifying means for classifying at least 
image data of a first object and image data of a 
second object from said image data in accord- 
ance with an output of said detecting means. 

2. An apparatus according to claim 1, wherein said
classifying means classifies a boundary block exist- 
ing at a boundary between the image data of said 
first object and the image data of said second 
object and further again divides said boundary 
block into blocks of a smaller size and executes a 
classification of objects. 

3. An apparatus according to claim 1, wherein said 
classifying means classifies a foreground block and 
a background block every said block. 

4. An apparatus according to claim 3, wherein said
classifying means classifies the block which hardly 
has a motion as said background block. 

5. An apparatus according to claim 3, wherein said 
classifying means classifies the blocks having an 
almost uniform motion as said background block. 

6. An apparatus according to claim 3, wherein among 
the blocks having an almost uniform motion, said 
classifying means classifies each of the blocks 
arranged so as to surround the other blocks as said 
background block. 

7. An apparatus according to claim 6, wherein said 
classifying means classifies each of said other 
blocks having the almost uniform motion as said 
foreground block. 

8. An apparatus according to claim 3, wherein among
the blocks having an almost uniform motion, said
classifying means classifies each of the blocks in
which the number of said almost uniform blocks is
the maximum as said background block.

9. An apparatus according to claim 3, wherein among
the blocks having an almost uniform motion, said
classifying means classifies each of the blocks
other than the blocks in which the number of said
almost uniform blocks is the maximum as said
foreground block.

10. An apparatus according to claim 2, wherein said
classifying means classifies a foreground block and
a background block every said block.

11. An apparatus according to claim 10, wherein after
the block size of said boundary block was changed,
when the block classification is again executed,
said classifying means uses a motion of the block
which has first been determined to be the background
block or a motion of the block which has first
been decided to be the foreground block for a
discrimination of the classification of said boundary
block.

12. An apparatus according to claim 1, further
comprising encoding means for encoding every said object
classified by said classifying means.

13. An apparatus according to claim 1, wherein said
classifying means classifies a block of said first
object, a block of said second object, and a boundary
block existing at a boundary between the image
data of said first object and the image data of said
second object.

14. An apparatus according to claim 13, further
comprising pixel classifying means for classifying the
pixels of an area of said first object, the pixels of an
area of said second object, and the pixels of an
area of said boundary in accordance with an output
of said classifying means.

15. An apparatus according to claim 14, further
comprising image data forming means for forming the
image data of the area of said first object and the
image data of the area of said boundary in accordance
with an output of said pixel classifying means.

16. An apparatus according to claim 15, further
comprising shape information forming means for forming
shape information to identify the area of said
first object, the area of said second object, and the
area of said boundary in accordance with the output
of said pixel classifying means.

17. An apparatus according to claim 15, wherein said
image data forming means sets the image data of
the pixels of said boundary area to a pixel value of
the area of said first object existing at a position that
is the nearest to the pixels of said boundary area.

18. An apparatus according to claim 16, wherein the
shape information for identifying said boundary
area is calculated in accordance with the pixel value
of the pixels of said boundary area, the pixel value
of the area of said first object existing at the position
that is the nearest to said pixel, and the pixel value
of the area of said second object existing at the
position that is the nearest to said pixel and denotes
a ratio of the pixel value of the area of said first or
second object included in the pixel values of said
boundary area.

19. An apparatus according to claim 16, wherein the
shape information for identifying said boundary
area is calculated in accordance with a pixel value
and a distance of the pixels of the area of said first
object existing at the position that is the nearest to
the pixels of said boundary area and a pixel value
and a distance of the pixels of the area of said second
object existing at the position that is the nearest
to the pixels of said boundary area.

20. An apparatus according to claim 16, further
comprising synthesizing means for synthesizing the
image data of said first object to another image data
by using said shape information.

21. An apparatus according to claim 16, further
comprising encoding means for encoding the image
data formed by said image data forming means.

22. An apparatus according to claim 21, further
comprising shape information encoding means for
encoding the shape information formed by said
shape information forming means.

23. An image processing method comprising the steps 
of: 

a) inputting image data; 

b) dividing said inputted image data into blocks 
constructed by a plurality of pixels; 

c) detecting a motion of said image data every 
said block; and 

d) classifying at least image data of a first 
object and image data of a second object from 
said image data in accordance with a result of 
said detection. 

24. An image processing apparatus comprising: 

a) input means for inputting image data; 

b) classifying means for classifying said image 
data into at least pixels of an area of a first 
object, pixels of an area of a second object, 
and pixels of a boundary area existing at a 
boundary between said first object area and 
said second object area; 

c) forming means for forming shape information
to identify the area of said first object, the area
of said second object, and said boundary area,
said forming means also forming information,
as said shape information, showing at which
mixture ratio the pixels of the boundary area
classified by said classifying means are
constructed with the pixels of the area of said first
object and the pixels of the area of said second
object;

d) image data encoding means for encoding
the image data classified by said classifying
means; and

e) shape information encoding means for
encoding the shape information formed by said
forming means.

25. An apparatus according to claim 24, further
comprising image data forming means for forming the
image data of the area of said first object and the
image data of said boundary area in accordance
with an output of said pixel classifying means, and
wherein said image data encoding means encodes
the image data formed by said image data forming
means.

26. An apparatus according to claim 25, wherein said
image data forming means sets the image data of
the pixels of said boundary area to a pixel value of
the area of said first object existing at a position that
is the nearest to the pixels of said boundary area.

27. An apparatus according to claim 26, wherein the
shape information for identifying said boundary
area is calculated in accordance with the pixel value
of the pixels of said boundary area, the pixel value
of the area of said first object existing at the position
that is the nearest to said pixels, and the pixel value
of the area of said second object existing at the
position that is the nearest to said pixels and
denotes a ratio of the pixel value of the area of said
first or second object included in the pixel value of
said boundary area.

28. An apparatus according to claim 26, wherein the
shape information for identifying said boundary
area is calculated in accordance with a pixel value
and a distance of the pixels of the area of said first
object existing at the position that is the nearest to
said pixels of said boundary area and a pixel value
and a distance of the pixels of the area of said second
object existing at the position that is the nearest
to the pixels of said boundary area.

29. An apparatus according to claim 24, further
comprising decoding means for decoding said encoded
image data.

30. An image processing method comprising the steps 
of: 

a) inputting image data;

b) classifying said image data into at least pixels
of an area of a first object, pixels of an area
of a second object, and pixels of a boundary
area existing at a boundary between the area
of said first object and the area of said second
object;

c) forming shape information to identify the
area of said first object, the area of said second
object, and the boundary area, said forming
step also forming information, as said shape
information, showing at which mixture ratio the
pixels of the boundary area classified in said
classifying step are constructed with the pixels
of the area of said first object and the pixels of
the area of said second object;

d) encoding the image data classified in said 
classifying step; and 

e) encoding the shape information formed in
said forming step.